---
title: "Assessment Part 1"
output: 
  html_notebook:
    theme: cerulean
  word_document: default

author: Andrew Wong Weng Seng
bibliography:
  Assessment1references.bib
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Assessment Part 1 {.tabset}

### Written Evaluation 

Word count: 600

This study used 2015 sub-regional data for Afghanistan published by the Demographic and Health Surveys (DHS) Program (Spatial Data Repository, n.d.). The analysis focuses on the potential spatial correlation between female literacy and household access to electricity, the latter being considered a measure of modernisation and development (Desai, 2012). QGIS and R were used to explore, visualise and compare female literacy rates and household access to electricity across the sub-regions of Afghanistan. This study does not attempt to establish causation; it is intended solely to visualise a potential spatial correlation.

The study began by loading shapefiles downloaded from the Spatial Data Repository into QGIS. Many fields had incomplete data, including the maternal mortality rates and HIV-prevalence indicators; missing values were coded as ‘9999’. Since the fields of interest (female literacy: ‘EDLITRWLIT’, household electricity: ‘HCELECHELC’) were complete, no interpolation was required. Following the data inspection, two identical map layers were placed in the workspace to produce a choropleth map visualising both variables. To ensure that the map clearly portrays the correlation in question, the sub-regions where fewer than 50% of households have electricity were highlighted through rule-based graduated symbology. Observations were binned using the Jenks natural breaks classification. Transparencies and symbols were adjusted, and the QGIS map was generated with appropriate map elements. Finally, the map was embedded into R Markdown (Dennett, 2018a).
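
The completeness check described above can also be reproduced in R. The following is a minimal sketch on a made-up data frame: the values, the `OTHERFIELD` column and the helper `count_sentinel` are hypothetical, and only the two field names and the ‘9999’ code come from the data described here. It counts how many records in each numeric field carry the missing-value code.

```r
# Toy stand-in for the shapefile's attribute table; all values are invented.
toy <- data.frame(
  DHSREGEN   = c("A", "B", "C", "D"),
  EDLITRWLIT = c(12.5, 30.1, 8.4, 22.0),
  HCELECHELC = c(45.0, 88.2, 60.3, 97.1),
  OTHERFIELD = c(9999, 3.2, 9999, 1.1)   # hypothetical incomplete field
)

# Hypothetical helper: number of sentinel-coded values in each numeric column.
count_sentinel <- function(df, sentinel = 9999) {
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]
  vapply(numeric_cols, function(x) sum(x == sentinel, na.rm = TRUE), integer(1))
}

missing_counts <- count_sentinel(toy)
print(missing_counts)

# The fields of interest need no interpolation only if both counts are zero.
fields_complete <- all(missing_counts[c("EDLITRWLIT", "HCELECHELC")] == 0)
print(fields_complete)
```

With the real attribute table in place of `toy`, the same call would flag any field that still contains the sentinel value.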

The first attempt to produce the map in R used the ‘tmap’ package. However, the result was unsatisfactory both aesthetically and in how intuitively it could be read, and making fine adjustments required in-depth knowledge of the extensive package documentation. After consulting peers and becoming familiar with the ‘leaflet’ package through Dennett (2018b), a second attempt was made using ‘leaflet’. Bins were determined by quantiles instead of Jenks natural breaks, and other map elements were added to produce a more comprehensive and intuitive map. The result is an interactive map with pop-up labels and multiple overlays, controlled through the ‘addLayersControl’ function (“Leaflet for R - Show/Hide Layers,” n.d.). The user can select the variable to be visualised, zoom in, and choose the basemap (Esri grey canvas or topographic map).
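
To illustrate the difference between the two binning schemes mentioned above, here is a small sketch using `classIntervals()` from the ‘classInt’ package on an invented, skewed vector of percentages. The vector `rates` is an assumption for illustration and is not taken from the survey data.

```r
library(classInt)

# Invented, right-skewed percentages standing in for a rate field.
rates <- c(2, 3, 4, 5, 6, 8, 9, 11, 14, 18, 25, 33, 41, 52, 60)

# Quantile breaks: roughly equal numbers of observations in each bin.
q_breaks <- classIntervals(rates, n = 5, style = "quantile")$brks

# Jenks natural breaks: bin edges chosen to minimise within-class variance.
j_breaks <- classIntervals(rates, n = 5, style = "jenks")$brks

# Compare how many observations fall into each bin under the two schemes.
print(table(cut(rates, breaks = q_breaks, include.lowest = TRUE)))
print(table(cut(rates, breaks = j_breaks, include.lowest = TRUE)))
```

On skewed data, quantile bins tend to spread observations evenly across colours, whereas Jenks bins follow clusters in the values, so the two maps can look quite different even when drawn from the same field.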

With the maps complete, it is crucial to evaluate the limitations and uncertainty of the data used. Although the data collection methodology was not available, one probable limitation is the geopolitical situation within Afghanistan, which could lead to under-reporting of female literacy rates and, because of security risks during data collection, compromise the accuracy of the data. These uncertainties should be considered in further analysis.

Using two different approaches to map the same research question has brought a keener understanding of the benefits and drawbacks of each. The ever-necessary process of cleaning, processing and analysing data is significantly easier with R packages. However, minute adjustments to a map are easier in QGIS because of its intuitive and friendly user interface, which allows smooth navigation and updates the map immediately with each change. In R, even the smallest change requires the cartographer to be familiar with the intricate documentation of each function within each package, and then to run chunks of code again. This can be time-consuming, although it becomes more intuitive with experience and practice. Overall, each platform has advantages and disadvantages, and these serve to emphasise their complementarity. The choice of program should depend on the structure of the data and on the cartographer’s task, experience and preference.


### QGIS Map

```{r Embedding Map generated by QGIS, echo=FALSE}
library(knitr)
knitr::include_graphics('afghan_latest.jpg')
```

```{r R-generation of Map, include=FALSE}
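# Packages used by the chunks in this document. Duplicate library() calls were
# dropped, and the retired packages maptools, rgeos and rgdal are no longer
# loaded; sf covers reading and reprojecting the shapefile.
library(sf)          # st_read(), st_transform()
library(sp)          # target class for as("Spatial")
library(dplyr)       # %>% and filter()
library(classInt)    # classIntervals() for bin breaks
library(tmap)        # static map attempt
library(tmaptools)   # read_osm()
library(leaflet)     # interactive map

##### Reading shapefile used in QGIS into R ####
# st_read() replaces tmaptools::read_shape(), which newer tmaptools releases
# no longer provide; it returns an sf object directly.
afghanmap <- st_read("sdr_subnational_data_dhs_2015.shp")

# Sub-regions where fewer than 50% of households have electricity
afghan_lowelec <- afghanmap %>% filter(HCELECHELC < 50)

# Making maps
# Esri basemap raster for the study area (read_osm() needs the OpenStreetMap
# package, which in turn needs Java)
afghan_osm <- read_osm(afghanmap, type = "esri", zoom = NULL)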
```

```{r Generation of map with TMAP, eval=FALSE, include=FALSE}
##### Plotting with tmap #####
# The map produced with this package was not very aesthetic or intuitive,
# so Leaflet is used in the next section instead.
tmap_mode("plot")
tm_shape(afghanmap) +
  tm_polygons("EDLITRWLIT",
              style = "jenks",
              palette = "YlOrBr",
              midpoint = NA,
              border.col = "black",
              lwd = 0.5,   # border width, passed on to tm_borders()
              title = "Female literacy (%)") +
  tm_text("DHSREGEN", size = 0.4, col = "black") +
  tm_shape(afghan_lowelec) +
  tm_fill("HCELECHELC", alpha = 1, title = "Households with electricity") +
  tm_borders(col = "black", lwd = 2.0) +
  tm_text("DHSREGEN", size = 0.4, col = "black")
```

### R Leaflet Map

```{r Generation of map with Leaflet, echo=TRUE}
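##### Plotting with Leaflet #####
library(leaflet)

# Leaflet expects WGS84 longitude/latitude coordinates (EPSG:4326)
afghanmapSP <- afghanmap %>%
  st_transform(crs = 4326) %>%
  as("Spatial")
afghan_lowelecSP <- afghan_lowelec %>%
  st_transform(crs = 4326) %>%
  as("Spatial")

# Quantile class breaks (five bins) for female literacy
breaks <- classIntervals(afghanmap$EDLITRWLIT, n = 5, style = "quantile")
breaks <- breaks$brks

# Quantile class breaks (five bins) for electricity access in the low-access sub-regions
breaks1 <- classIntervals(afghan_lowelec$HCELECHELC, n = 5, style = "quantile")
breaks1 <- breaks1$brks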

pal <- colorBin(palette = "OrRd", 
                domain = afghanmapSP$EDLITRWLIT,
                bins = breaks,
                reverse = TRUE)

pal1 <- colorBin(palette = "Greys",
                 domain = afghan_lowelecSP$HCELECHELC,
                 bins = breaks1,
                 reverse = TRUE)

leaflet() %>%
  addPolygons(data = afghanmapSP, 
              stroke = FALSE,
              fillOpacity = 0.8,
              smoothFactor = 0.5,
              fillColor = ~pal(EDLITRWLIT),
              popup = ~DHSREGEN,
              group = "Female Literacy rates"
  ) %>%
  addLegend(data = afghanmapSP,
            "bottomright",
            pal = pal,
            values = ~EDLITRWLIT,
            title = "Female Literacy rates (%)",
            opacity = 1,
            group = "Female Literacy rates") %>%
  addProviderTiles("Esri.WorldGrayCanvas", group = "Light basemap") %>%
  addProviderTiles("Esri.WorldTopoMap", group = "Topo basemap") %>%
  addPolygons(data = afghan_lowelecSP,
              stroke = TRUE,
              color = "black",
              weight = 3,
              fillOpacity = 1,
              smoothFactor = 0.5,
              fillColor = ~pal1(HCELECHELC),
              popup = ~DHSREGEN,
              group = "Sub-regions with <50% Households with electricity"
  ) %>%
  addLegend(data = afghan_lowelecSP,
            "topright",
            pal = pal1,
            values = ~HCELECHELC,
            title = "Sub-regions with <50% Households with electricity",
            opacity = 1,
            group = "Sub-regions with <50% Households with electricity") %>%
  addLayersControl(
    baseGroups = c("Light basemap", "Topo basemap"),
    overlayGroups = c("Female Literacy rates","Sub-regions with <50% Households with electricity"),
    options = layersControlOptions(collapsed = FALSE)
  )
```
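
The `colorBin()` calls in the chunk above return functions that map a value to the colour of the bin it falls in. The short sketch below shows this behaviour in isolation. The bin edges and test values are hypothetical and chosen only for illustration; they are not the quantile breaks computed from the survey data.

```r
library(leaflet)

# Hypothetical bin edges, for illustration only.
edges <- c(0, 10, 25, 50, 100)

# colorBin() returns a function that maps numbers to hex colour strings.
pal_demo <- colorBin(palette = "OrRd", domain = c(0, 100), bins = edges)

# 3 and 7 share the first bin; 30 and 80 fall into different bins.
demo_colours <- pal_demo(c(3, 7, 30, 80))
print(demo_colours)
```

Setting `reverse = TRUE`, as in the map above, flips the order in which the palette's colours are assigned to the bins.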

